-
Notifications
You must be signed in to change notification settings - Fork 60
feat(kotlin): Implement complete metrics support for Kotlin language #1215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat(kotlin): Implement complete metrics support for Kotlin language #1215
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR implements comprehensive metrics support for the Kotlin language by replacing stub implementations with full AST-based analysis. The implementation builds on a tree-sitter grammar upgrade (0.25.3 → 0.26.3) and switches from tree-sitter-kotlin-ng to tree-sitter-kotlin-codanna.
Changes:
- Implements Kotlin-specific
CheckerandGettertraits for AST node recognition and Halstead classification - Adds complete implementations for all 11 metric modules (ABC, Cognitive, Cyclomatic, Exit, Halstead, LOC, NArgs, NOM, NPA, NPM, WMC)
- Introduces
traverse_childrenhelper method for multi-level AST navigation in NPA/NPM metrics
Reviewed changes
Copilot reviewed 45 out of 46 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| Cargo.toml, enums/Cargo.toml | Updated tree-sitter to 0.26.3, switched Kotlin grammar to tree-sitter-kotlin-codanna 0.3.9, upgraded other grammar versions |
| src/langs.rs, enums/src/languages.rs | Updated Kotlin language binding from tree-sitter-kotlin-ng to tree-sitter-kotlin-codanna |
| src/macros.rs, enums/src/macros.rs | Updated language macro to use new Kotlin grammar's language() function |
| src/checker.rs | Implemented Checker trait for Kotlin with comment, function, closure, call, and string detection |
| src/getter.rs | Implemented Getter trait with space kind identification and Halstead operator/operand classification |
| src/node.rs | Added traverse_children helper for navigating multi-level child paths |
| src/metrics/abc.rs | Implemented ABC metric with Kotlin-specific helper functions for condition counting |
| src/metrics/cognitive.rs | Implemented cognitive complexity with nesting-aware control structure tracking |
| src/metrics/cyclomatic.rs | Implemented cyclomatic complexity for Kotlin control flow |
| src/metrics/exit.rs | Implemented return statement detection |
| src/metrics/halstead.rs | Implemented Halstead metrics using generic compute_halstead function |
| src/metrics/loc.rs | Implemented lines of code with comment and logical line detection |
| src/metrics/nargs.rs | Implemented parameter counting with Kotlin-specific helper |
| src/metrics/npm.rs | Implemented public method counting with visibility modifier traversal |
| src/metrics/npa.rs | Implemented public attribute counting with visibility modifier traversal |
| src/metrics/wmc.rs | Extracted shared compute_wmc helper, implemented for Kotlin |
| src/tools.rs | Simplified parentheses in conditional expression |
| src/languages/language_*.rs | Updated language enum mappings for Rust, Python, JavaScript due to grammar updates |
| tree-sitter-* | Updated parser.h and parser.c files with @generated annotation, version bumps |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| | Kotlin::PIPEPIPE | ||
| | Kotlin::UnaryExpression |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The is_non_arg implementation includes PIPEPIPE (||) as a non-argument token, but this doesn't match the typical pattern from other language implementations which focus on delimiters and structural elements. The inclusion of UnaryExpression also seems unusual - other implementations typically only include punctuation tokens. This may cause issues with parameter counting in functions that use these operators.
| | Kotlin::PIPEPIPE | |
| | Kotlin::UnaryExpression |
| let mut has_boolean_content = matches!( | ||
| node.parent().unwrap().kind_id().into(), | ||
| BinaryExpression | IfExpression | WhileStatement | DoWhileStatement | ForStatement | ||
| ); |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of unwrap() on node.parent() could cause a panic if the node doesn't have a parent. While this follows the existing Java implementation pattern, it's safer to use proper error handling. Consider using node.parent()? or an if let pattern to handle the None case gracefully, especially since this is new code being added.
| let mut has_boolean_content = matches!( | |
| node.parent().unwrap().kind_id().into(), | |
| BinaryExpression | IfExpression | WhileStatement | DoWhileStatement | ForStatement | |
| ); | |
| let mut has_boolean_content = node | |
| .parent() | |
| .map(|parent| { | |
| matches!( | |
| parent.kind_id().into(), | |
| BinaryExpression | IfExpression | WhileStatement | DoWhileStatement | ForStatement | |
| ) | |
| }) | |
| .unwrap_or(false); |
| let is_not_operator = matches!(node_kind, UnaryExpression) | ||
| && matches!(node.child(0).unwrap().kind_id().into(), BANG); | ||
|
|
||
| // Stops the exploration if the node is neither | ||
| // a parenthesized expression nor a `Not` operator | ||
| if !is_parenthesised_exp && !is_not_operator { | ||
| break; | ||
| } | ||
|
|
||
| // Sets the flag to true if a `Not` operator is found | ||
| // This is used to prove if a variable or a value returned by a method is actually boolean | ||
| // e.g. `return (!x);` | ||
| if !has_boolean_content && is_not_operator { | ||
| has_boolean_content = true; | ||
| } | ||
|
|
||
| // Parenthesized expressions and `Not` operators nodes | ||
| // always store their expressions in the children nodes of index one | ||
| // Reference: tree-sitter-kotlin grammar for parenthesized_expression and prefix_expression | ||
| node = node.child(1).unwrap(); | ||
| node_kind = node.kind_id().into(); |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of unwrap() on node.child(0) and node.child(1) could cause a panic if the expected children don't exist. While this mirrors the Java implementation, it assumes a specific AST structure that may not always hold. Consider adding guards or using proper error handling to prevent potential panics.
| add_cloc_lines(stats, start, end); | ||
| } | ||
| // Statement types for logical lines counting | ||
| ForStatement | WhileStatement | DoWhileStatement => { |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The LOC implementation only counts ForStatement, WhileStatement, and DoWhileStatement as logical lines. This appears incomplete compared to other language implementations which typically count various statement types including if statements, return statements, assignments, etc. Consider adding more statement types to provide accurate logical line counts.
| ForStatement | WhileStatement | DoWhileStatement => { | |
| ForStatement | |
| | WhileStatement | |
| | DoWhileStatement | |
| | IfExpression | |
| | WhenExpression | |
| | ReturnExpression | |
| | ThrowExpression | |
| | TryExpression | |
| | BreakExpression | |
| | ContinueExpression | |
| | Assignment => { |
|
|
||
| pub(crate) fn child(&self, pos: usize) -> Option<Node<'a>> { | ||
| self.0.child(pos).map(Node) | ||
| self.0.child(pos as u32).map(Node) |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conversion from usize to u32 via pos as u32 could cause issues on platforms where usize is larger than u32, though this is unlikely in practice. However, this change should be verified to ensure it doesn't break existing code that relies on being able to pass larger indices. Consider adding a comment explaining why this cast is safe or using a checked conversion.
| self.0.child(pos as u32).map(Node) | |
| let idx = u32::try_from(pos).ok()?; | |
| self.0.child(idx).map(Node) |
| // StringLiteral covers both line strings and multi-line strings in this grammar | ||
| RealLiteral | IntegerLiteral | HexLiteral | BinLiteral | CharacterLiteralToken1 | UniCharacterLiteralToken1 | ||
| | LiteralConstant | StringLiteral | StringContent | LambdaLiteral | FunctionLiteral | ||
| | ObjectLiteral | UnsignedLiteral | LongLiteral | BooleanLiteral | CharacterLiteral => { |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Halstead operand classification appears to be missing the Identifier node type, which is present in other language implementations (Python, JavaScript, TypeScript, Rust, C++, Java). This means that variable names and function names won't be counted as operands in Halstead metrics calculations. Consider adding Identifier and SimpleIdentifier to the operand match pattern.
| | ObjectLiteral | UnsignedLiteral | LongLiteral | BooleanLiteral | CharacterLiteral => { | |
| | ObjectLiteral | UnsignedLiteral | LongLiteral | BooleanLiteral | CharacterLiteral | |
| | Identifier | SimpleIdentifier => { |
| // The child node of index 3 contains the `condition` when | ||
| // the initialization expression is a variable declaration | ||
| // e.g. `for ( int i=0; `condition`; ... ) {}` | ||
| if let Some(condition) = node.child(3) { | ||
| match condition.kind_id().into() { | ||
| SEMI => { | ||
| // The child node of index 4 contains the `condition` when | ||
| // the initialization expression is not a variable declaration | ||
| // e.g. `for ( i=0; `condition`; ... ) {}` | ||
| if let Some(cond) = node.child(4) { | ||
| match cond.kind_id().into() { | ||
| CallExpression | Identifier | True | False | SEMI | RPAREN => { | ||
| stats.conditions += 1.; | ||
| } | ||
| ParenthesizedExpression | UnaryExpression => { | ||
| kotlin_inspect_container(&cond, &mut stats.conditions); | ||
| } | ||
| _ => {} | ||
| } | ||
| } | ||
| } | ||
| CallExpression | Identifier | True | False => { | ||
| stats.conditions += 1.; | ||
| } | ||
| ParenthesizedExpression | UnaryExpression => { | ||
| kotlin_inspect_container(&condition, &mut stats.conditions); | ||
| } | ||
| _ => {} |
Copilot
AI
Jan 20, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ForStatement handling in the ABC metric appears to assume Java/C-style for loops with initialization, condition, and update expressions at specific child indices (3 and 4). However, Kotlin's for loops follow a different syntax (for (item in collection)) and don't have the same structure. This implementation will likely not work correctly for Kotlin for-loops and may cause incorrect condition counting or potential panics if child indices don't exist.
| // The child node of index 3 contains the `condition` when | |
| // the initialization expression is a variable declaration | |
| // e.g. `for ( int i=0; `condition`; ... ) {}` | |
| if let Some(condition) = node.child(3) { | |
| match condition.kind_id().into() { | |
| SEMI => { | |
| // The child node of index 4 contains the `condition` when | |
| // the initialization expression is not a variable declaration | |
| // e.g. `for ( i=0; `condition`; ... ) {}` | |
| if let Some(cond) = node.child(4) { | |
| match cond.kind_id().into() { | |
| CallExpression | Identifier | True | False | SEMI | RPAREN => { | |
| stats.conditions += 1.; | |
| } | |
| ParenthesizedExpression | UnaryExpression => { | |
| kotlin_inspect_container(&cond, &mut stats.conditions); | |
| } | |
| _ => {} | |
| } | |
| } | |
| } | |
| CallExpression | Identifier | True | False => { | |
| stats.conditions += 1.; | |
| } | |
| ParenthesizedExpression | UnaryExpression => { | |
| kotlin_inspect_container(&condition, &mut stats.conditions); | |
| } | |
| _ => {} | |
| // Java/C-style for-loops have semicolons in their header (e.g. `for (i = 0; condition; i++)`). | |
| // Kotlin-style for-loops (`for (item in collection)`) do not have semicolons and also do not | |
| // contain a boolean condition in the same sense, so they should not contribute to the C metric. | |
| let mut has_semi = false; | |
| for i in 0..node.child_count() { | |
| if let Some(child) = node.child(i) { | |
| if matches!(child.kind_id().into(), SEMI) { | |
| has_semi = true; | |
| break; | |
| } | |
| } | |
| } | |
| if has_semi { | |
| // The child node of index 3 contains the `condition` when | |
| // the initialization expression is a variable declaration | |
| // e.g. `for ( int i=0; `condition`; ... ) {}` | |
| if let Some(condition) = node.child(3) { | |
| match condition.kind_id().into() { | |
| SEMI => { | |
| // The child node of index 4 contains the `condition` when | |
| // the initialization expression is not a variable declaration | |
| // e.g. `for ( i=0; `condition`; ... ) {}` | |
| if let Some(cond) = node.child(4) { | |
| match cond.kind_id().into() { | |
| CallExpression | Identifier | True | False | SEMI | RPAREN => { | |
| stats.conditions += 1.; | |
| } | |
| ParenthesizedExpression | UnaryExpression => { | |
| kotlin_inspect_container(&cond, &mut stats.conditions); | |
| } | |
| _ => {} | |
| } | |
| } | |
| } | |
| CallExpression | Identifier | True | False => { | |
| stats.conditions += 1.; | |
| } | |
| ParenthesizedExpression | UnaryExpression => { | |
| kotlin_inspect_container(&condition, &mut stats.conditions); | |
| } | |
| _ => {} | |
| } |
- Update tree-sitter from 0.25.4 to 0.26.3 - Switch tree-sitter-kotlin-ng to tree-sitter-kotlin-codanna 0.3.9 - Update tree-sitter-javascript to 0.25.0 - Update tree-sitter-python to 0.25.0 - Update tree-sitter-rust to 0.24.0 - Regenerate language enums for all grammars - Fix Node::child() parameter type for tree-sitter 0.26 API
- Update tree-sitter from 0.25.4 to 0.26.3 - Switch tree-sitter-kotlin-ng to tree-sitter-kotlin-codanna 0.3.9 - Update tree-sitter-javascript to 0.25.0 - Update tree-sitter-python to 0.25.0 - Update tree-sitter-rust to 0.24.0 - Regenerate language enums for all grammars - Fix Node::child() to use u32 cast for tree-sitter 0.26 API - Add macro case for tree_sitter_kotlin_codanna::language()
eef9228 to
6a86e55
Compare
- Implement Checker trait for KotlinCode with AST node recognition - Implement Getter trait with space kind and Halstead classification - Implement all 11 metric modules: ABC, Cognitive, Cyclomatic, Exit, Halstead, LOC, NArgs, NOM, NPA, NPM, WMC - Add traverse_children helper method to Node for multi-level traversal
6a86e55 to
14fe85a
Compare
Summary
This PR implements complete metrics support for the Kotlin language, building on top of #1214 (tree-sitter 0.26.3 upgrade).
Changes:
Checkertrait for KotlinCode with AST node recognitionGettertrait for KotlinCode with space kind and Halstead classificationtraverse_childrenhelper method to Node for multi-level AST traversalBackground
The upstream repository includes Kotlin in the language enum via
mk_langs!macro, but usesimplement_metric_trait!to provide stub implementations that return empty/zero values. This PR replaces those stubs with real implementations that properly analyze Kotlin AST nodes.Implementation Decisions
1. AST Node Mapping (Checker)
Mapped tree-sitter-kotlin-codanna node types to the Checker trait methods:
is_commentLineComment,MultilineCommentis_func_spaceSourceFile,ClassDeclarationis_funcFunctionDeclarationis_closureLambdaLiteralis_stringStringLiteral,LineStringLiteral,MultiLineStringLiteralis_callCallExpressionis_non_argFunctionBody,ModifierList,Statements(parameter containers excluded)2. Halstead Classification (Getter)
Classified Kotlin tokens for Halstead metrics:
=,+=,-=,*=,/=,%=,&&,||,!,==,!=,<,>,<=,>=,+,-,*,/,%,++,--,?.,?:,!!,::,->,as,as?,is,!is,in,!in,..,..<this,super3. Cognitive Complexity
Implemented nesting-aware complexity following the same pattern as Java:
if,when,for,while,do-while,catch,&&,||else ifchains (no additional nesting penalty)4. ABC Metric Helper Functions
Created Kotlin-specific helper functions for ABC metric calculation:
kotlin_count_unary_conditions: Counts negation operators in conditionskotlin_inspect_container: Identifies class members vs local variablescompute_kotlin_args: Extracts parameter counts from function declarations5. traverse_children Method
Added a utility method to
Nodefor navigating multi-level child paths:This is used in NPA/NPM metrics to navigate from property/function declarations through modifier lists to find visibility modifiers.
6. WMC Computation
Extracted shared WMC computation logic into a reusable
compute_wmchelper function, allowing consistent weighted method complexity calculation across languages.Files Changed
src/checker.rssrc/getter.rssrc/metrics/abc.rssrc/metrics/cognitive.rssrc/metrics/cyclomatic.rssrc/metrics/exit.rssrc/metrics/halstead.rssrc/metrics/loc.rssrc/metrics/nargs.rssrc/metrics/npa.rssrc/metrics/npm.rssrc/metrics/wmc.rssrc/node.rsTest Plan
cargo checkDependencies
This PR depends on #1214 (tree-sitter 0.26.3 upgrade) which provides:
🤖 Generated with Claude Code